Search for: All records

Creators/Authors contains: "Christensen, Martin P"

  1. EDBT (Ed.)
    In data lakes, one of the core challenges remains finding relevant tables. We introduce the notion of semantic data lakes, i.e., repositories where datasets are linked to concepts and entities described in a knowledge graph (KG). We formalize the problem of semantic table search, i.e., retrieving tables containing information semantically related to a given set of entities, and provide the first formal definition of semantic relatedness of a dataset to tuples of entities. Our solution offers the first general framework to compute the semantic relevance of the contents of a table w.r.t. entity tuples, as well as efficient algorithms (exploiting semantic signals, such as entity types and embeddings) to scale the semantic search to repositories with hundreds of thousands of distinct tables. Our extensive experiments on both real-world and synthetic benchmarks show that our approach is able to retrieve more relevant tables (up to 5.4 times higher recall) in comparison to existing methods while ensuring fast response times (up to 17 times faster with LSH).